Objective intelligibility assessment of text-to-speech systems through utterance verification
Abstract
Objective assessment of synthetic speech intelligibility can be a useful tool for the development of text-to-speech (TTS) systems, as it provides a reproducible and inexpensive alternative to subjective listening tests. In a recent work, it was shown that the intelligibility of synthetic speech could be assessed objectively by comparing two sequences of phoneme class conditional probabilities, corresponding to instances of synthetic and human reference speech, respectively. In this paper, we build on those findings to propose a novel approach that formulates objective intelligibility assessment as an utterance verification problem using hidden Markov models, thereby alleviating the need for human reference speech. Specifically, given each text input to the TTS system, the proposed approach automatically verifies the words in the output synthetic speech signal and estimates an intelligibility score based on word recall statistics. We evaluate the proposed approach on the 2011 Blizzard Challenge data, and show that the estimated scores and the subjective intelligibility scores are highly correlated (Pearson’s |R| = 0.94).
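The scoring and evaluation steps described above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: it takes the per-word accept/reject decisions as given (in the paper they come from HMM-based utterance verification, which is not modeled here), and the system names, decisions, and subjective scores are all made-up placeholders. It also assumes the subjective scores are word error rates, which would make the raw correlation negative; the abstract reports only |R|.

```python
from math import sqrt


def word_recall(verified):
    """Fraction of input-text words that the verifier accepted."""
    return sum(verified) / len(verified)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical verifier output: one accept/reject flag per word of the
# input text, pooled over all utterances of each TTS system.
decisions = {
    "system_A": [True, True, True, False],
    "system_B": [True, False, True, False],
    "system_C": [False, False, True, False],
}

# Hypothetical subjective word error rates from a listening test.
subjective_wer = {"system_A": 0.20, "system_B": 0.45, "system_C": 0.80}

systems = sorted(decisions)
objective_scores = [word_recall(decisions[s]) for s in systems]
subjective_scores = [subjective_wer[s] for s in systems]

r = pearson_r(objective_scores, subjective_scores)
print(f"|R| = {abs(r):.2f}")
```

Computing one recall value per system and correlating at the system level mirrors how the abstract frames the evaluation; a per-utterance analysis would need a different pooling step.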
Related resources
Utterance Verification for Automating the Hearing in Noise Test (HINT)
Tests of speech intelligibility play an essential role in many audiological procedures, including diagnostic assessment, verification of hearing aid and cochlear implant fittings, outcome assessment following intervention, and screening of applicants for hearing-critical jobs. The Hearing In Noise Test (HINT) [1] is a speech intelligibility test commonly used for these purposes. A limitation of...
Combining Phonological and Acoustic ASR-Free Features for Pathological Speech Intelligibility Assessment
Intelligibility is widely used to measure the severity of articulatory problems in pathological speech. Recently, a number of automatic intelligibility assessment tools have been developed. Most of them use automatic speech recognizers (ASR) to compare the patient’s utterance with the target text. These methods are bound to one language and tend to be less accurate when speakers hesitate or mak...
Objective Intelligibility Assessment of Text-to-Speech System using Template Constrained Generalized Posterior Probability
Speech intelligibility is one of the most important measures in evaluating a text-to-speech (TTS) synthesizer. In this paper, we propose an automatic objective intelligibility measure for evaluating synthesized speech using template constrained generalized posterior probability (TCGPP). TCGPP is a posterior probability based confidence measure, which has the advantage to identify small granularit...
Cipher text only attack on speech time scrambling systems using correction of audio spectrogram
Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...
Speech intelligibility after repair of cleft lip and palate
Background: Intelligibility refers to the understandability of speech, and a lack of it can negatively affect children’s overall communication effectiveness. Children with repaired cleft lip and/or cleft palate (CL/P) may experience poor speech intelligibility. This study aimed at evaluating speech intelligibility in children with repaired CL/P who had not been referred to sp...